Abstract



In the cyclic-to-random shuffle, we are given $n$ cards arranged in a circle. At step $k$, we exchange the $k$th card along the circle with a uniformly chosen random card. The problem of determining the mixing time of the cyclic-to-random shuffle was raised by Aldous and Diaconis in 1986. Recently, Mironov used this shuffle as a model for the cryptographic system known as "RC4" and proved an upper bound of $O(n \log n)$ for the mixing time. We prove a matching lower bound, thus establishing that the mixing time is indeed of order $\Theta(n \log n)$. We also prove an upper bound of $O(n \log n)$ for the mixing time of any "semi-random transposition shuffle", i.e., any shuffle in which a random card is exchanged with another card chosen according to an arbitrary (deterministic or random) rule. To prove our lower bound, we exhibit an explicit complex-valued test function which typically takes very different values for permutations arising from the cyclic-to-random shuffle and for uniform random permutations; we expect that this test function may be useful in future analysis of RC4. Perhaps surprisingly, the proof hinges on the fact that the function $e^z - 1$ has nonzero fixed points in the complex plane. A key insight from our work is the importance of complex analysis tools for uncovering structure in nonreversible Markov chains.



1 Introduction 

The mixing time of a Markov chain on a finite state space is the number of steps until it is close to its stationary distribution, starting from an arbitrary state. The mixing time is a key parameter in analyzing random sampling algorithms and is of intrinsic interest in probability and statistical physics as well. For many natural Markov chains, if some of the randomness is removed from the transition rule, resulting in a "more deterministic" process with the same stationary distribution, the chain becomes significantly harder to analyze. Indeed, some of the most challenging problems in the field concern the analysis of such "pseudo-random" variants of well understood chains. Some examples include the riffle shuffle [14, 19] compared to the Thorp shuffle [20], the asymmetric exclusion process [8] compared with its systematic scan version [13], and the comparison between the standard and systematic versions of Glauber dynamics for Gaussian fields [4, 5, 6].

Shuffling by random transpositions is one of the simplest random walks on the symmetric group: given $n$ cards in a row, at each step two cards are picked uniformly at random and exchanged. This shuffle was precisely analyzed in 1981, see [12]. In the "cyclic-to-random" shuffle (invented by Thorp [21]), at step $t$ a uniformly chosen random card is exchanged with the card at position $t \bmod n$. It is easy to see that this semi-random shuffle still converges to the uniform distribution on permutations of $n$ cards. In their landmark 1986 paper on card shuffling [3], Aldous and Diaconis posed as a challenge the analysis of the cyclic-to-random shuffle. More recently, Mironov [17] related this shuffle to the behavior of the RC4 encryption algorithm. Mironov showed that a strong uniform time argument, due to Broder, can be adapted to yield an upper bound of $O(n \log n)$ on the mixing time. He posed as an open problem whether this bound is tight.

In this paper we establish a lower bound of $\Omega(n \log n)$ for the mixing time of the cyclic-to-random shuffle, thus answering the questions posed by Aldous and Diaconis and by Mironov. We also prove a general upper bound of $O(n \log n)$ on the mixing time of any semi-random transposition shuffle, i.e., any shuffle in which a random card is exchanged with another card chosen according to an arbitrary (deterministic or random) rule that may vary at each step. Previously, the best available upper bound for such a general process was $O(n^2)$, proved by Pak [18].

To prove the lower bound for the cyclic-to-random transposition shuffle $\{\sigma_t\}$, we find an eigenfunction of the shuffle that mixes slowly. (This approach was used by Wilson [22, 23] to prove $\Omega(n^3 \log n)$ lower bounds for the shuffle generated by transpositions of adjacent cards and several variants.) First, we determine the eigenvalues of a nonreversible renewal Markov chain $M$ on the $n$-cycle which describes the behavior of a single card. The asymptotics for the leading eigenvalues of $M$ depend on the fact that the function $e^z - 1$ has nonzero fixed points in the complex plane. We then pick an eigenfunction $f$ for $M$ and use it to construct a test function $F$, defined on permutations, which is a weighted sum of $f$ applied to the locations of all cards. To show that the distribution at time $t$ of $F(\sigma_t)$ is far from the distribution of $F(\sigma)$ for a uniform random permutation $\sigma$, the key is to estimate the variance. The variance is a sum of correlations between pairs of cards; to bound these correlations, we couple the shuffle with a system of independent particles evolving according to $M$. This coupling approach has intuitive appeal, and could potentially be used for other chains on permutations. Alternatively, one could bound the variance of $F(\sigma_t)$ using the martingale decomposition method of Wilson [22, 23].

Our general upper bound for semi-random transpositions is proved via a strong uniform time argument, extending earlier arguments of Broder and Mironov.

We believe that some of our technical insights may be carried over to other situations where lower bounds for 
nonreversible or "pseudo-random" Markov chains are sought. These insights include: 

• The analysis of a given Markov chain with a transition rule that varies in time can sometimes be reduced to the 
analysis of an equivalent time-homogeneous chain. 

• Coupling arguments, which are often applied to obtain upper bounds for mixing times, can also be used to 
establish lower bounds. 

• When seeking to understand a nonreversible Markov chain, results of classical complex analysis, such as Rouché's theorem, can be powerful tools. Thus methods from complex analysis should be added to techniques from probability, combinatorics, functional analysis and representation theory in the toolkit of Markov chain analysis.

1.1 Statement of main results 

Let $\{L_t\}_{t=1}^{\infty}$ be a sequence of random variables taking values in $[n] = \{0, 1, \ldots, n-1\}$ and let $\{R_t\}_{t=1}^{\infty}$ be a sequence of i.i.d. cards chosen uniformly from $[n]$. The semi-random transposition shuffle generated by $\{L_t\}$ is a stochastic process $\{\sigma_t^*\}_{t=0}^{\infty}$ on the symmetric group $S_n$, defined as follows. Fix the initial permutation $\sigma_0^*$. The permutation $\sigma_t^*$ at time $t$ is obtained from $\sigma_{t-1}^*$ by transposing the cards at locations $L_t$ and $R_t$.

In the cyclic-to-random shuffle, the sequence $L_t$ is given by $L_t = t \bmod n$.
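The definition above can be turned into a short simulation. The following is a minimal Python sketch, not code from the paper: the function names, the identity starting permutation, and the convention that `deck[pos]` stores the card at position `pos` are all assumptions made for illustration.

```python
import random


def semi_random_transposition_shuffle(n, steps, next_location, rng=None):
    """Run `steps` steps of a semi-random transposition shuffle on n cards.

    next_location(t) plays the role of L_t and must return a value in
    {0, ..., n-1}; R_t is drawn uniformly at random. The deck starts from
    the identity arrangement (an assumption of this sketch).
    """
    rng = rng or random.Random()
    deck = list(range(n))  # deck[pos] = card currently at position pos
    for t in range(1, steps + 1):
        left = next_location(t)    # L_t, chosen by the given rule
        right = rng.randrange(n)   # R_t, uniform on {0, ..., n-1}
        deck[left], deck[right] = deck[right], deck[left]
    return deck


def cyclic_to_random(n, steps, rng=None):
    """Cyclic-to-random shuffle: L_t = t mod n."""
    return semi_random_transposition_shuffle(n, steps, lambda t: t % n, rng)
```

Passing `lambda t: 0` as `next_location` gives the star transpositions shuffle, and passing a rule that draws a uniform random location gives the random transposition shuffle.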

The stochastic process $\{\sigma_t^*\}$ is a time-inhomogeneous Markov chain on $S_n$, and converges to the uniform stationary distribution for any $\sigma_0^*$ and any choice of $\{L_t\}$. It is a time-homogeneous Markov chain if the $L_t$ are i.i.d. The special case where the $L_t$ are i.i.d. uniform is the random transposition shuffle [2, 3, 12], the random walk on $S_n$ generated by all transpositions; at the other extreme, if all the $L_t$ are identically 0, we get the random walk generated by "star transpositions", where in each step a randomly chosen card is exchanged with the card in position 0.

Let $\mu_t^*$ be the distribution of $\sigma_t^*$ at time $t$, and let $\|\mu_t^* - U\|_{TV}$ denote the total variation distance between $\mu_t^*$ and the uniform distribution $U$. Define the mixing time by

$$\tau_{mix} = \max_{\sigma_0^*} \min\Big\{t : \|\mu_t^* - U\|_{TV} \le \frac{1}{2e}\Big\}.$$

The choice of constant $\frac{1}{2e}$ ensures that, for any $\epsilon > 0$, we have $\|\mu_t^* - U\|_{TV} \le \epsilon$ if $t \ge \lceil \log \epsilon^{-1} \rceil \tau_{mix}$ (see [2]).
Theorem 1.1 The cyclic-to-random transposition shuffle has mixing time $\Omega(n \log n)$. More precisely, the mixing time is at least

$$\Big(\frac{1}{|1+\zeta|} + o(1)\Big)\, n \log n,$$

where $\zeta$ is any nonzero complex root of the equation $\psi(z) = e^z - z - 1 = 0$.

Using Mathematica, we find the root $\zeta = 2.088\ldots + 7.461\ldots \times i$ of $\psi$. This gives $|1 + \zeta| = 8.075\ldots$ and yields a lower bound of $(.123 + o(1))\, n \log n$ for the mixing time.
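The numerical values quoted above can be reproduced without Mathematica. The sketch below applies Newton's method to $\psi(z) = e^z - z - 1$ in the complex plane; the starting point `2 + 7.5j` and the names `psi` and `newton_root` are choices made here for illustration, not part of the paper.

```python
import cmath


def psi(z):
    """psi(z) = e^z - z - 1, whose nonzero roots enter Theorem 1.1."""
    return cmath.exp(z) - z - 1


def newton_root(z0, iterations=50):
    """Newton iteration z <- z - psi(z)/psi'(z), with psi'(z) = e^z - 1."""
    z = z0
    for _ in range(iterations):
        z = z - psi(z) / (cmath.exp(z) - 1)
    return z


# Start close to the root reported in the text (assumed starting point).
zeta = newton_root(2 + 7.5j)
print(zeta, abs(1 + zeta), 1 / abs(1 + zeta))
```

The last printed number is the constant in front of $n \log n$ that corresponds to $|1+\zeta|$.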

Theorem 1.2 The semi-random transposition shuffle $\{\sigma_t^*\}$ generated by any sequence $\{L_t\}$ has mixing time at most $O(n \log n)$. More precisely, there is a constant $C_0$ such that for any $C_1 > C_0$ and any initial configuration $\sigma_0^*$, we have

$$\|\mu_t^* - U\|_{TV} \le n^{-\beta} \quad \text{for all } t \ge C_1 n \log n,$$

for some $\beta = \beta(C_1) > 0$.

Remark. The proof shows that we can take $C_0 = 32\theta^{-3} + \theta^{-1}$ where $\theta = e^{-2}(1 - e^{-1})/2$. We do not know the minimal value of $C_0$; it cannot be strictly less than 1 because of the star transpositions shuffle, where the mixing time is $(1 + o(1))\, n \log n$, see [10].
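To get a feeling for the size of the constant in the Remark, one can evaluate it numerically. This is a small sketch under the reading $C_0 = 32\theta^{-3} + \theta^{-1}$ with $\theta = e^{-2}(1 - e^{-1})/2$; the function name `remark_constants` is hypothetical.

```python
import math


def remark_constants():
    """Return (theta, C_0) for theta = e^{-2}(1 - e^{-1})/2, C_0 = 32/theta^3 + 1/theta."""
    theta = math.exp(-2) * (1 - math.exp(-1)) / 2
    c0 = 32 * theta ** -3 + theta ** -1
    return theta, c0


theta, c0 = remark_constants()
print(theta, c0)
```

The resulting $C_0$ is several orders of magnitude above the lower limit 1 noted in the Remark, which illustrates how much room is left between the two.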



2 A lower bound for the cyclic-to-random shuffle 



2.1 The behavior of a single card via renewals 

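Fixing a specific card $a$, it is natural to study the renewal chain on the state space $[n] = \{0, \ldots, n-1\}$, where state $i \in [n]$ indicates that the location $j$ of card $a$ satisfies $j + i = t \bmod n$. This chain is described by the transition matrix $M$, where for all $i \in [n]$, we have $M_{0,i} = 1/n$ and $M_{i,1} = 1/n$, while $M_{i,i+1} = 1 - 1/n$, for all $i \ge 1$. (For $i = n-1$, the last equation reads $M_{n-1,0} = 1 - 1/n$.) In other words,

$$M = \begin{pmatrix}
\frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \cdots & \frac{1}{n} \\
0 & \frac{1}{n} & 1 - \frac{1}{n} & 0 & \cdots & 0 \\
0 & \frac{1}{n} & 0 & 1 - \frac{1}{n} & \cdots & 0 \\
\vdots & \vdots & & & \ddots & \vdots \\
0 & \frac{1}{n} & 0 & 0 & \cdots & 1 - \frac{1}{n} \\
1 - \frac{1}{n} & \frac{1}{n} & 0 & 0 & \cdots & 0
\end{pmatrix}.$$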
We will now find the eigenfunctions of the chain, that is, the right eigenvectors of the matrix $M$. Let $f = (f_0, \ldots, f_{n-1})^T$ be such a (column) eigenvector. Then we obtain the following equations:

$$\frac{1}{n} \sum_{j=0}^{n-1} f_j = \lambda f_0, \tag{1}$$

and, for $1 \le i \le n-1$,

$$\frac{1}{n} f_1 + \Big(1 - \frac{1}{n}\Big) f_{i+1} = \lambda f_i \quad (\text{where we denote } f_n = f_0). \tag{2}$$



It is easy to check that, up to scaling, $(1, \ldots, 1)^T$ is the unique eigenvector corresponding to the eigenvalue $\lambda = 1$, and that $(-1, n-1, -1, \ldots, -1)^T$ is the unique eigenvector corresponding to $\lambda = 0$.
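The two eigenvectors just mentioned, and the fact that $M$ is doubly stochastic, are easy to confirm numerically. The following NumPy sketch builds $M$ from the entries $M_{0,i} = 1/n$, $M_{i,1} = 1/n$ and $M_{i,i+1} = 1 - 1/n$ described earlier; the helper name `renewal_matrix` and the choice $n = 7$ are assumptions of this illustration.

```python
import numpy as np


def renewal_matrix(n):
    """Transition matrix M of the single-card renewal chain on {0, ..., n-1}."""
    M = np.zeros((n, n))
    M[0, :] = 1.0 / n                       # from state 0 the card lands uniformly
    for i in range(1, n):
        M[i, 1] += 1.0 / n                  # card is picked, then shifted to state 1
        M[i, (i + 1) % n] += 1.0 - 1.0 / n  # otherwise it moves one state up
    return M


n = 7
M = renewal_matrix(n)
ones = np.ones(n)
v0 = -np.ones(n)
v0[1] = n - 1  # candidate eigenvector (-1, n-1, -1, ..., -1) for eigenvalue 0

print(M.sum(axis=0), M.sum(axis=1))  # column sums and row sums
print(M @ ones, M @ v0)              # images of the two eigenvectors
```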

We now assume that $f$ is a right eigenvector corresponding to an eigenvalue $\lambda \notin \{0, 1\}$. Since $M$ is doubly stochastic, (1) implies that $\sum_{i=0}^{n-1} f_i = 0$ and $f_0 = 0$; to verify this, sum (1) and the $n-1$ equations in (2).
Writing $y_i = f_{i+1} - f_i$ for $1 \le i \le n-1$ (recall that $f_n = f_0$), the equation (2) for $i = 1$ gives:

$$y_1 = \frac{n(\lambda - 1)}{n-1} f_1.$$

For $1 \le i \le n-2$, subtracting successive equations in (2) yields

$$\Big(1 - \frac{1}{n}\Big) y_{i+1} = \lambda y_i.$$

Thus if we set $\gamma = \frac{n\lambda}{n-1}$, then $y_1 = \big(\gamma - \frac{n}{n-1}\big) f_1$ and $y_j = \gamma^{j-1} y_1$ for $2 \le j \le n-1$. Without loss of generality we may assume that $f_1 = 1$. Therefore,

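$$f_k = 1 + \sum_{j=1}^{k-1} y_j = 1 + y_1 \sum_{j=1}^{k-1} \gamma^{j-1} = 1 + \Big(\gamma - \frac{n}{n-1}\Big) \sum_{j=1}^{k-1} \gamma^{j-1} \tag{3}$$

for $1 \le k \le n$. Thus

$$(n-1)(1-\gamma) f_k = \big(n - (n-1)\gamma\big)\gamma^{k-1} - 1 \tag{4}$$
for $1 \le k \le n$. Since $\sum_{k=0}^{n-1} f_k = 0$ and $f_n = f_0$, we infer that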

$$0 = (n-1)(1-\gamma)^2 \sum_{k=1}^{n} f_k = \big(n - (n-1)\gamma\big)(1 - \gamma^n) - n(1-\gamma) = \gamma - n\gamma^n + (n-1)\gamma^{n+1}.$$

Since $\gamma \ne 0$, it follows that

$$(n-1)\gamma^n - n\gamma^{n-1} + 1 = 0. \tag{5}$$

Note that this equation has a double root at $\gamma = 1$. We therefore conclude that the eigenvalues $\lambda \ne 0, 1$ correspond (via the relation $\gamma = \frac{n\lambda}{n-1}$) to the roots $\gamma \ne 1$ of (5). We investigate these roots next.

2.2 Properties of roots of equation (5)

Lemma 2.1 All the solutions of (5) satisfy $|\gamma| \le 1$.
Proof: If $|\gamma| > 1$, then

$$\big|(n-1)\gamma^{n-1}\big| > \Big|\sum_{j=0}^{n-2} \gamma^j\Big| = \Big|\frac{\gamma^{n-1} - 1}{\gamma - 1}\Big|,$$

and multiplying by $|\gamma - 1|$ gives

$$\big|(n-1)\gamma^n - (n-1)\gamma^{n-1}\big| > \big|\gamma^{n-1} - 1\big|,$$
so $\gamma$ cannot be a solution to (5). ■
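The correspondence between eigenvalues of $M$ and roots of the polynomial $(n-1)\gamma^n - n\gamma^{n-1} + 1$, together with the bound $|\gamma| \le 1$, can be checked numerically for a small $n$. The sketch below is only an illustration: `renewal_matrix`, `nontrivial_gammas`, the tolerance, and $n = 8$ are all choices made here, and $M$ is rebuilt from the entries $M_{0,i} = M_{i,1} = 1/n$, $M_{i,i+1} = 1 - 1/n$.

```python
import numpy as np


def renewal_matrix(n):
    """Transition matrix M of the single-card renewal chain on {0, ..., n-1}."""
    M = np.zeros((n, n))
    M[0, :] = 1.0 / n
    for i in range(1, n):
        M[i, 1] += 1.0 / n
        M[i, (i + 1) % n] += 1.0 - 1.0 / n
    return M


def nontrivial_gammas(n, tol=1e-6):
    """Return (gamma, residual) pairs for the eigenvalues other than 0 and 1.

    gamma = n * lambda / (n - 1); residual is |(n-1) gamma^n - n gamma^(n-1) + 1|.
    """
    pairs = []
    for lam in np.linalg.eigvals(renewal_matrix(n)):
        if abs(lam) < tol or abs(lam - 1) < tol:
            continue  # skip the eigenvalues 0 and 1
        gamma = lam * n / (n - 1)
        residual = abs((n - 1) * gamma ** n - n * gamma ** (n - 1) + 1)
        pairs.append((gamma, residual))
    return pairs


for gamma, residual in nontrivial_gammas(8):
    print(abs(gamma), residual)
```

Each printed residual should be at the level of floating-point error, and each modulus should not exceed 1.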

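In the other direction, we need to show that (5) has solutions close to 1. We prove:

Lemma 2.2 There exists a solution of the equation $(n-1)\gamma^n - n\gamma^{n-1} + 1 = 0$ which satisfies

$$|1 - \gamma| \le \frac{|\zeta|}{n} + O\Big(\frac{1}{n^2}\Big), \tag{6}$$

and

$$|1 - \lambda| \le \frac{|1 + \zeta|}{n} + O\Big(\frac{1}{n^2}\Big), \tag{7}$$
where $\zeta$ is any nonzero root of $e^z - z - 1 = 0$, and $\lambda = (1 - 1/n)\gamma$.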



Proof: By defining $\omega = \gamma^{-1}$, we obtain the equation $\omega^n - n\omega + n - 1 = 0$, or $\omega^n + n(1 - \omega) - 1 = 0$. Now write $\omega = 1 + z/n$ to get the asymptotic equation $\psi(z) = e^z - z - 1 = 0$. By Hurwitz's theorem, every solution $\zeta$ of the equation $\psi(\zeta) = 0$ is a limit of solutions $z_n$ of the equations $(1 + z_n/n)^n - z_n - 1 = 0$. Since $\omega - 1 = z_n/n$, we obtain

$$\gamma = 1 - z_n/n + O\Big(\frac{1}{n^2}\Big). \tag{8}$$

Therefore

$$\lambda = (1 - 1/n)\gamma = 1 - \frac{1 + z_n}{n} + O\Big(\frac{1}{n^2}\Big). \tag{9}$$
To get more precise estimates, recall $\psi(z) = e^z - z - 1$ and let $\psi_n(z) = (1 + z/n)^n - z - 1$. By Taylor expansion,

$$|n \log(1 + z/n) - z| = \frac{|z|^2}{2n} + O\Big(\frac{1}{n^2}\Big),$$

so in a bounded domain,

$$|\psi_n(z) - \psi(z)| = |(1 + z/n)^n - e^z| = \frac{|z^2 e^z|}{2n} + O\Big(\frac{1}{n^2}\Big).$$

Below we will prove that the equation $e^z - z - 1 = 0$ has nonzero roots. Let $\zeta$ be such a root; then $\zeta$ is a simple root, since $\psi'(\zeta) = e^{\zeta} - 1 = \zeta$. Thus for $z$ on the circle $\{|z - \zeta| = b/n\}$, we have

$$|\psi(z)| = |\psi'(\zeta)| \frac{b}{n} + O\Big(\frac{1}{n^2}\Big) = |\zeta| \frac{b}{n} + O\Big(\frac{1}{n^2}\Big).$$

On the other hand, for $z$ on that circle,

$$|\psi_n(z) - \psi(z)| = \frac{|\zeta^2 e^{\zeta}|}{2n} + O(n^{-2}).$$

By Rouché's Theorem, it follows that if $b > |\zeta e^{\zeta}/2|$ and $n$ is large enough, then $\psi_n$ has the same number of zeros as $\psi$ in the disk $\{|z - \zeta| < b/n\}$, namely, exactly one zero. We thus obtain (6) by (8). Similarly, (7) follows from (9).

The equation $\psi(z) = e^z - z - 1 = 0$ has the solution $z = 0$. In order to show that it has a root $z \ne 0$, write $z = x + iy$ to get

$$e^x \cos y = 1 + x \quad \text{and} \quad e^x \sin y = y.$$
Solve for $x$ to get $x = y \cos y / \sin y - 1$. Inserting this value of $x$ into the second equation we get

$$\frac{y}{\sin y} = \exp\Big(\frac{y \cos y}{\sin y} - 1\Big). \tag{10}$$

We will find a solution of the form $y = 2\pi m + a$, where $\pi/4 < a < \pi/2$. Note that if $y = 2\pi m + \pi/4$, then the left hand side of (10) is $\sqrt{2}\, y$, while the right hand side is $\exp(y - 1)$, which is strictly larger than $\sqrt{2}\, y$ for all $m \ge 1$. If, on the other hand, $y = 2\pi m + \pi/2$, then the left hand side is $y$ while the right hand side is $\exp(-1)$, which is strictly smaller than $y$. Since both sides of (10) are continuous in $y$ on this interval, we conclude by the intermediate value theorem that for all integers $m \ge 1$, there exists at least one solution $y = 2\pi m + a$, where $\pi/4 < a < \pi/2$. ■
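The sign change used in this existence argument also gives a practical way to locate the root. The sketch below runs bisection on the difference of the two sides of $y/\sin y = \exp(y\cos y/\sin y - 1)$ over the interval $[2\pi m + \pi/4,\ 2\pi m + \pi/2]$ and then recovers $x = y\cos y/\sin y - 1$. The names `gap` and `root_in_strip` and the iteration count are assumptions of this illustration.

```python
import cmath
import math


def gap(y):
    """Left side minus right side of y/sin(y) = exp(y cos(y)/sin(y) - 1)."""
    return y / math.sin(y) - math.exp(y * math.cos(y) / math.sin(y) - 1)


def root_in_strip(m, iterations=200):
    """Bisection for a root z = x + iy of e^z - z - 1 with y in (2 pi m + pi/4, 2 pi m + pi/2)."""
    lo = 2 * math.pi * m + math.pi / 4  # here the left side is smaller: gap < 0
    hi = 2 * math.pi * m + math.pi / 2  # here the left side is larger: gap > 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    y = (lo + hi) / 2
    x = y * math.cos(y) / math.sin(y) - 1
    return complex(x, y)


z = root_in_strip(1)
print(z, abs(cmath.exp(z) - z - 1))
```

For $m = 1$ this should land on the root $2.088\ldots + 7.461\ldots i$ quoted after Theorem 1.1.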




2.3 The test function 

In this subsection we fix an eigenvalue $\lambda$ of $M$ such that $|\lambda| \ge 1 - O(1/n)$, and let $f : [n] \to \mathbb{C}$ be a corresponding eigenfunction. We will denote the states of the $n$ cards at time $t$ by $\sigma_t(0), \ldots, \sigma_t(n-1)$, and assume that at time 0 we start with the identity permutation, so $\sigma_0(i) = i$ for all $i$. We emphasize that $\sigma_t$ is obtained from $\sigma_{t-1}$ by first transposing the card at state 0 with a uniform random card, and then moving all cards one state up (modulo $n$). Thus for each $i$, the sequence $\{\sigma_t(i)\}_{t \ge 0}$ is a Markov chain with transition matrix $M$. To relate this to the description of the cyclic-to-random shuffle $\{\sigma_t^*\}$ in the introduction, observe that $\sigma_t^*$ is obtained from $\sigma_t$ by a rotation of size $t \bmod n$.
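The claim that each card follows the chain $M$ can be verified exactly by enumerating the $n$ equally likely choices of the random state in one step. The following sketch uses exact rational arithmetic; the helper names `one_step_law` and `renewal_row` are hypothetical, and the rows of $M$ are taken to be $M_{0,i} = 1/n$ for all $i$, and $M_{i,1} = 1/n$, $M_{i,i+1} = 1 - 1/n$ for $i \ge 1$.

```python
from fractions import Fraction


def one_step_law(n, state):
    """Exact law of a card's next state when it is currently in `state`.

    One step: swap the card in state 0 with the card in a uniform state r,
    then move every card one state up modulo n.
    """
    law = [Fraction(0)] * n
    for r in range(n):
        if state == 0:
            after_swap = r      # the card in state 0 goes to the chosen state
        elif state == r:
            after_swap = 0      # the chosen card is brought to state 0
        else:
            after_swap = state  # all other cards are untouched by the swap
        law[(after_swap + 1) % n] += Fraction(1, n)
    return law


def renewal_row(n, i):
    """Row i of the transition matrix M."""
    if i == 0:
        return [Fraction(1, n)] * n
    row = [Fraction(0)] * n
    row[1] += Fraction(1, n)
    row[(i + 1) % n] += Fraction(n - 1, n)
    return row


n = 6
print(all(one_step_law(n, i) == renewal_row(n, i) for i in range(n)))
```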
We will focus on the following test function $F : S_n \to \mathbb{C}$:

$$F(\sigma) = \frac{1}{n} \sum_{i=0}^{n-1} f(\sigma(i))\, \overline{f(i)}. \tag{11}$$

Since $f$ satisfies $\sum_{i=0}^{n-1} f(i) = 0$, under the uniform distribution $U$ on $S_n$ we have

$$\mathbf{E}_U[F(\sigma)] = 0. \tag{12}$$
It is also easy to see that for the cyclic-to-random shuffle,

$$\mathbf{E}[F(\sigma_t)] = \frac{1}{n} \sum_{i=0}^{n-1} \lambda^t |f(i)|^2 = \lambda^t \|f\|_2^2, \tag{13}$$

where $\|\cdot\|_2$ denotes the $\ell^2$-norm w.r.t. the uniform distribution on $[n]$, i.e., $\|f\|_2^2 = \frac{1}{n} \sum_{i=0}^{n-1} |f(i)|^2$.
We now calculate the second moment of $F(\sigma)$ under the stationary distribution.



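Lemma 2.3

$$\mathbf{E}_U\big(|F(\sigma)|^2\big) = \frac{\|f\|_2^4}{n-1}.$$

Proof: We have

$$\mathbf{E}_U\big(|F(\sigma)|^2\big) = \frac{1}{n^2} \sum_{i \ne j} \mathbf{E}_U\big(f(\sigma(i)) \overline{f(\sigma(j))}\big)\, \overline{f(i)} f(j) + \frac{1}{n^2} \sum_{i} \mathbf{E}_U\big(|f(\sigma(i))|^2\big)\, |f(i)|^2. \tag{14}$$
The second term in (14) can be evaluated as

$$\frac{1}{n^2} \sum_{i} \mathbf{E}_U\big(|f(\sigma(i))|^2\big)\, |f(i)|^2 = \frac{\|f\|_2^2}{n^2} \sum_{i} |f(i)|^2 = \frac{\|f\|_2^4}{n}. \tag{15}$$

Now let $i \ne j$ and let $\tilde\sigma$ be an independent copy of $\sigma$. Then

$$\mathbf{E}_U\big[f(\sigma(i)) \overline{f(\sigma(j))}\big] = \frac{n}{n-1} \Big(\mathbf{E}_U\big(f(\sigma(i)) \overline{f(\tilde\sigma(j))}\big) - \frac{1}{n} \mathbf{E}_U\big(|f(\sigma(i))|^2\big)\Big) = -\frac{\mathbf{E}_U\big(|f(\sigma(i))|^2\big)}{n-1} = -\frac{\|f\|_2^2}{n-1}.$$

Similarly,

$$\sum_{i \ne j} \overline{f(i)} f(j) = \sum_{i} \sum_{j} \overline{f(i)} f(j) - \sum_{i} |f(i)|^2 = -n \|f\|_2^2.$$

Therefore the first term in (14) can be evaluated as

$$\frac{1}{n^2} \sum_{i \ne j} \mathbf{E}_U\big[f(\sigma(i)) \overline{f(\sigma(j))}\big]\, \overline{f(i)} f(j) = -\frac{\|f\|_2^2}{n^2 (n-1)} \sum_{i \ne j} \overline{f(i)} f(j) = \frac{\|f\|_2^4}{n(n-1)}. \tag{16}$$

Combining (15) and (16), we obtain the result via (14). ■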
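For small $n$ the second moment under the uniform distribution can be computed by brute force over all permutations and compared with $\|f\|_2^4/(n-1)$. The sketch below assumes the test function has the form $F(\sigma) = \frac{1}{n}\sum_i f(\sigma(i))\overline{f(i)}$ and uses an arbitrary complex vector with $\sum_i f(i) = 0$, which is the only property of $f$ that the computation relies on; the function names and the sample vector are illustrative.

```python
from itertools import permutations


def second_moment_uniform(f):
    """Average of |F(sigma)|^2 over all permutations sigma of {0, ..., n-1}."""
    n = len(f)
    total, count = 0.0, 0
    for sigma in permutations(range(n)):
        value = sum(f[sigma[i]] * f[i].conjugate() for i in range(n)) / n
        total += abs(value) ** 2
        count += 1
    return total / count


def predicted_second_moment(f):
    """The closed form ||f||_2^4 / (n - 1), with ||f||_2^2 = (1/n) sum |f(i)|^2."""
    n = len(f)
    norm_sq = sum(abs(v) ** 2 for v in f) / n
    return norm_sq ** 2 / (n - 1)


# Arbitrary sample vector whose entries sum to zero (an assumption of this sketch).
f = [1 + 2j, -3 + 0j, 0.5j, 2 - 1j, -1.5j]
print(second_moment_uniform(f), predicted_second_moment(f))
```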



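For later use, we record here a simple variational bound on $f$:
Lemma 2.4 There exists a universal constant $c$ such that $\|f\|_\infty \le c \|f\|_2$ for all $n$.
Proof: Recall that $|\gamma - 1| \le c/n$ for some $c > 1$ and that $|\gamma| \le 1$; enlarging $c$ if necessary, we also have $|\gamma - \frac{n}{n-1}| \le c/n$. It follows from (3) that, for all $k \ne 0$,

$$|f(k)| \le 1 + \frac{c}{n} \sum_{j=1}^{k-1} |\gamma^{j-1}| \le 1 + c. \tag{17}$$

On the other hand, for $1 \le k \le n/2c$ we have

$$|f(k)| \ge 1 - \frac{c}{n} \sum_{j=1}^{k-1} |\gamma^{j-1}| \ge 1 - \frac{kc}{n} \ge \frac{1}{2}. \tag{18}$$

By (17) we have $\|f\|_\infty \le 1 + c$. From (18), it follows that

$$\|f\|_2 \ge \Big(\frac{1}{2c} \Big(\frac{1}{2}\Big)^2\Big)^{1/2} = \frac{1}{2 (2c)^{1/2}}.$$
This completes the proof. ■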



2.4 The second moment of $F(\sigma_t)$

We begin with an estimate of the contribution to the second moment from a specific pair of cards. Fix two distinct cards, $i$ and $j$. Denote by $A_i(s) = \{\sigma_s(i) = 0\}$ the event that at step $s$ card $i$ is in state 0 (so it will be transposed with a uniform random card in the next step). Let

$$N_{ij}(t) = \sum_{s=0}^{t-1} \big(\mathbf{P}[A_i(s)] + \mathbf{P}[A_j(s)]\big)$$

denote the expected number of times $s < t$ where one of cards $i, j$ was at state 0. Since at each step there is exactly one card at position 0, we have $\sum_{i=0}^{n-1} \sum_{s=0}^{t-1} \mathbf{P}[A_i(s)] = t$ and therefore

$$\sum_{i \ne j} N_{ij}(t) \le 2nt. \tag{19}$$



Next, we will couple $\{\sigma_t\}$ with a process $\{(\eta_t, \tilde\eta_t)\}$, where $\eta$ and $\tilde\eta$ are two independent copies of the cyclic-to-random shuffle starting from the identity permutation. We will observe the motions of cards $i, j$ in $\eta, \tilde\eta$ respectively; note that, unlike in $\sigma$, these two motions are independent. We use the coupling to bound the dependence between the cards in $\sigma$.



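Lemma 2.5 For any two cards $i \ne j$ and all $t$ we have

$$\Big|\mathbf{E}\big[f(\sigma_t(i)) \overline{f(\sigma_t(j))}\big] - \mathbf{E}\big[f(\eta_t(i)) \overline{f(\tilde\eta_t(j))}\big]\Big| \le \frac{4t + 4n N_{ij}(t)}{n^2} \|f\|_\infty^2. \tag{20}$$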



Proof: We define inductively a coupling of the process $\{\sigma_t\}$ and the pair process $\{(\eta_t, \tilde\eta_t)\}$. If $(\sigma_s(i), \sigma_s(j)) \ne (\eta_s(i), \tilde\eta_s(j))$ then the updates for $\sigma$ and $(\eta, \tilde\eta)$ are performed independently. Otherwise, we have

$$(\sigma_s(i), \sigma_s(j)) = (\eta_s(i), \tilde\eta_s(j)), \tag{21}$$

and there are three cases to consider in the definition of the coupling at step $s + 1$:

Case 1. Card $i$ is at position 0 at time $s$.
Case 2. Card $j$ is at position 0 at time $s$.
Case 3. Both of the cards $i, j$ are not in position 0 at time $s$.



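In Case 1, we take

$$(\sigma_{s+1}(i), \sigma_{s+1}(j)) = \begin{cases} (\ell, \sigma_s(j) + 1) \bmod n & \text{w.p. } \frac{1}{n} \text{ for all } \ell \ne \sigma_s(j) + 1, \\ (\sigma_s(j) + 1, 1) \bmod n & \text{w.p. } \frac{1}{n}, \end{cases}$$

and

$$(\eta_{s+1}(i), \tilde\eta_{s+1}(j)) = \begin{cases} (\ell, \tilde\eta_s(j) + 1) \bmod n & \text{w.p. } \frac{n-1}{n^2} \text{ for all } \ell \ne \tilde\eta_s(j) + 1, \\ (\tilde\eta_s(j) + 1, 1) \bmod n & \text{w.p. } \frac{1}{n^2}, \\ (\tilde\eta_s(j) + 1, \tilde\eta_s(j) + 1) \bmod n & \text{w.p. } \frac{n-1}{n^2}, \\ (\ell, 1) \bmod n & \text{w.p. } \frac{1}{n^2} \text{ for all } \ell \ne \tilde\eta_s(j) + 1. \end{cases}$$

Thus, given that the processes satisfy (21) at time $s$ and that at that time card $i$ is at location 0, we may couple the processes to satisfy (21) at time $s + 1$ with conditional probability at least $\frac{(n-1)^2 + 1}{n^2} > 1 - \frac{2}{n}$. Similarly, in Case 2, if the coupling satisfies (21) at time $s$ then (21) can be satisfied at time $s + 1$ with conditional probability at least $1 - \frac{2}{n}$.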
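In Case 3, the transition probabilities for the process $\sigma$ are given by

$$(\sigma_{s+1}(i), \sigma_{s+1}(j)) = \begin{cases} (\sigma_s(i) + 1, \sigma_s(j) + 1) \bmod n & \text{w.p. } 1 - \frac{2}{n}, \\ (\sigma_s(i) + 1, 1) \bmod n & \text{w.p. } \frac{1}{n}, \\ (1, \sigma_s(j) + 1) \bmod n & \text{w.p. } \frac{1}{n}, \end{cases}$$

and the transition probabilities for the process $(\eta, \tilde\eta)$ are given by

$$(\eta_{s+1}(i), \tilde\eta_{s+1}(j)) = \begin{cases} (\eta_s(i) + 1, \tilde\eta_s(j) + 1) \bmod n & \text{w.p. } 1 - \frac{2}{n} + \frac{1}{n^2}, \\ (\eta_s(i) + 1, 1) \bmod n & \text{w.p. } \frac{1}{n} - \frac{1}{n^2}, \\ (1, \tilde\eta_s(j) + 1) \bmod n & \text{w.p. } \frac{1}{n} - \frac{1}{n^2}, \\ (1, 1) \bmod n & \text{w.p. } \frac{1}{n^2}. \end{cases}$$

It therefore follows that in Case 3, if the processes satisfy (21) at time $s$, they may be coupled to satisfy it at time $s + 1$ with conditional probability $1 - \frac{2}{n^2}$.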

It now follows that the probability that the processes "unglue" by time $t$ (i.e., (21) fails for some $s \le t$) is at most

$$\frac{2t}{n^2} + \frac{2 N_{ij}(t)}{n}. \tag{22}$$

We now estimate the difference of expected values in (20). On the event where the processes satisfy (21) at time $t$ we get a zero contribution. On the complementary event we get a contribution bounded by $2\|f\|_\infty^2$. We thus obtain (20) by (22). ■



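Since the processes $\eta$ and $\tilde\eta$ are independent, it follows from (13) that

$$\mathbf{E}\big[f(\eta_t(i)) \overline{f(\tilde\eta_t(j))}\big] = \mathbf{E}\big[f(\eta_t(i))\big]\, \mathbf{E}\big[\overline{f(\tilde\eta_t(j))}\big] = \lambda^t f(i)\, \overline{\lambda^t f(j)} = |\lambda|^{2t} f(i) \overline{f(j)}.$$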

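Therefore, by Lemma 2.5 we obtain
Corollary 2.6 For any two cards $i \ne j$ we have

$$\Big|\mathbf{E}\big(f(\sigma_t(i)) \overline{f(\sigma_t(j))}\big)\Big| \le \Big(|\lambda|^{2t} + \frac{4t + 4n N_{ij}(t)}{n^2}\Big) \|f\|_\infty^2. \tag{23}$$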



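We can now bound the second moment of $F$.
Lemma 2.7 $\mathbf{E}\big[|F(\sigma_t)|^2\big] \le \Big(|\lambda|^{2t} + \frac{12t + n}{n^2}\Big) \|f\|_\infty^4$.
Proof: We have

$$\mathbf{E}\big[|F(\sigma_t)|^2\big] = \frac{1}{n^2} \sum_{i \ne j} \mathbf{E}\big[f(\sigma_t(i)) \overline{f(\sigma_t(j))}\big]\, \overline{f(i)} f(j) + \frac{1}{n^2} \sum_{i} \mathbf{E}\big[|f(\sigma_t(i))|^2\big]\, |f(i)|^2. \tag{24}$$

Note that

$$\frac{1}{n^2} \sum_{i} \mathbf{E}\big[|f(\sigma_t(i))|^2\big]\, |f(i)|^2 \le \frac{\|f\|_\infty^4}{n}. \tag{25}$$

By Corollary 2.6, for any $i \ne j$,

$$\Big|\mathbf{E}\big[f(\sigma_t(i)) \overline{f(\sigma_t(j))}\big]\, \overline{f(i)} f(j)\Big| \le \Big(|\lambda|^{2t} + \frac{4t + 4n N_{ij}(t)}{n^2}\Big) \|f\|_\infty^4. \tag{26}$$

Inserting (25) and (26) into (24) we obtain

$$\mathbf{E}\big[|F(\sigma_t)|^2\big] \le \frac{\|f\|_\infty^4}{n^2} \Big(n + n^2 |\lambda|^{2t} + 4t + \frac{4}{n} \sum_{i \ne j} N_{ij}(t)\Big) \le \frac{\|f\|_\infty^4}{n^2} \big(n + n^2 |\lambda|^{2t} + 12t\big),$$

using (19). This completes the proof. ■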

2.5 The mixing time 

Given the bound on the second moment of our test function from the previous section, and the bound on the eigenvalue from Section 2.2, it is straightforward to derive a lower bound on the mixing time.

Proof of Theorem 1.1: Recall from Lemma 2.2 that the equation $e^z - z - 1 = 0$ has nonzero roots and let $\zeta$ be such a root. By Lemma 2.2 it follows that for large $n$ there exists a solution $\gamma$ of the equation $(n-1)\gamma^n - n\gamma^{n-1} + 1 = 0$ satisfying (6) and (7). Fix $\gamma$ and let $f$ be the corresponding eigenfunction of $M$. Let $F$ be the test function (11). Write $\rho = n|1 - \lambda|$ and note that $\rho = |\zeta + 1| + O(\frac{1}{n})$ by (7).

We use the test function $F$. Let $\mu_t$ be the distribution of $\sigma_t$ in the cyclic-to-random shuffle where $\sigma_0$ is the identity permutation and recall that $U$ denotes the (uniform) stationary measure on $S_n$. Let $g^2$ be the density of $\mu_t$ with respect to $\nu = (\mu_t + U)/2$. Let $h^2$ be the density of $U$ with respect to $\nu$.

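By (12) and (13) we have that

$$|\lambda|^t \|f\|_2^2 = \big|\mathbf{E}_{\mu_t}[F] - \mathbf{E}_U[F]\big| = \Big|\int F g^2\, d\nu - \int F h^2\, d\nu\Big|.$$
On the other hand, by Cauchy-Schwarz,

$$\Big|\int F g^2\, d\nu - \int F h^2\, d\nu\Big|^2 = \Big|\int F (g + h)(g - h)\, d\nu\Big|^2 \le \int |F|^2 (g + h)^2\, d\nu \cdot \int (g - h)^2\, d\nu.$$
By Lemma 2.3 and Lemma 2.7,

$$\int |F|^2 (g + h)^2\, d\nu \le 2 \int |F|^2 g^2\, d\nu + 2 \int |F|^2 h^2\, d\nu = 2\mathbf{E}_{\mu_t}\big[|F|^2\big] + 2\mathbf{E}_U\big[|F|^2\big]$$

$$\le \frac{2\|f\|_2^4}{n-1} + 2\Big(|\lambda|^{2t} + \frac{12t + n}{n^2}\Big) \|f\|_\infty^4 \le 2\Big(|\lambda|^{2t} + \frac{12t + 3n}{n^2}\Big) \|f\|_\infty^4.$$

Moreover,

$$\int (g - h)^2\, d\nu \le \int |g^2 - h^2|\, d\nu = 2\|\mu_t - U\|_{TV}.$$
Recalling Lemma 2.4 we conclude that

$$\|\mu_t - U\|_{TV} \ge \frac{|\lambda|^{2t} \|f\|_2^4}{4\|f\|_\infty^4 \big(|\lambda|^{2t} + \frac{12t + 3n}{n^2}\big)} \ge \frac{1}{4c^4} \cdot \frac{|\lambda|^{2t}}{|\lambda|^{2t} + \frac{12t + 3n}{n^2}}.$$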



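It follows that $\|\mu_t - U\|_{TV} = \Omega(1)$ when $\frac{12t + 3n}{n^2 |\lambda|^{2t}} = O(1)$. Note that if $t \ge n$, then $\frac{3n}{n^2 |\lambda|^{2t}} = O\big(\frac{t}{n^2 |\lambda|^{2t}}\big)$ and that if

$$t = \frac{1}{\rho}\, n (\log n - \log\log n - b) = \frac{1}{|1 + \zeta| + O(1/n)}\, n (\log n - \log\log n - b),$$

then

$$\frac{t}{n^2 |\lambda|^{2t}} \le \big(1 + O(1/n)\big) \frac{n \log n}{\rho\, n \log n\, e^{b}} = \frac{1 + O(1/n)}{\rho\, e^{b}}.$$

The proof of the theorem follows. ■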
3 An upper bound for general semi-random transpositions



In this final section we prove Theorem 1.2.

Proof: By the triangle inequality it suffices to prove the theorem assuming that the $L_t$ are deterministic. We thus restrict to that case.

We define a strong uniform time for the shuffle, i.e., a stopping time $T$ with the property that, given $T = t$, the random permutation $\sigma_t^*$ has the uniform distribution over $S_n$. It is well known (see, e.g., [3]) that, if $T$ is a strong uniform time, then the distribution $\mu_t^*$ of $\sigma_t^*$ satisfies

$$\|\mu_t^* - U\|_{TV} \le \mathbf{P}[T > t] \quad \forall t.$$

Following Broder (as described in [11]) and Mironov [17], we define the stopping time in terms of a card marking process as follows. Initially all cards are unmarked. First, the card initially at $L_1$ is marked. Later, at time $t$, we mark the card at $L_t$ if it is unmarked and the card at $R_t$ is already marked, and also if $R_t = L_t$ and this location has an unmarked card. Once a card is marked it remains so at all future times. Set $T$ to be the first time $t$ at which all cards are marked. Clearly $T$ is a stopping time. The theorem follows immediately from the following two claims:

Claim 1: T is a strong uniform time. 

Claim 2: There exists $C_0 < \infty$ such that for any $C_1 > C_0$ we have

$$\mathbf{P}(T > C_1 n \log n) \le n^{-\beta}$$

for some $\beta = \beta(C_1) > 0$. Specifically, this holds for $C_0 = 32\theta^{-3} + \theta^{-1}$, where $\theta = e^{-2}(1 - e^{-1})/2$.

Proof of Claim 1: By induction, it is easy to check the following. At any time $t$, given that $k$ cards have been marked, conditional on the set of marked cards and their locations, the mapping between these two sets (assigning to every marked card its location) is uniformly distributed among the $k!$ possibilities. See [17] or [11] for details.
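The marking process is straightforward to simulate, which gives an empirical view of the stopping time $T$. The sketch below is an illustration under stated assumptions: the card initially at $L_1$ is marked, and afterwards the card at $L_t$ is marked when it is unmarked and either the card at $R_t$ is already marked or $R_t = L_t$. The marking decision is taken on the arrangement just before the transposition at time $t$ (an interpretation made here), and the function name, the `max_steps` guard and the identity start are hypothetical choices.

```python
import random


def strong_uniform_time(n, next_location, rng=None, max_steps=10**7):
    """Simulate the card marking process and return the first time all cards are marked.

    next_location(t) plays the role of the deterministic location L_t.
    """
    rng = rng or random.Random()
    deck = list(range(n))                  # deck[pos] = card currently at pos
    marked = [False] * n                   # indexed by card, not by position
    marked[deck[next_location(1)]] = True  # the card initially at L_1
    marked_count = 1
    for t in range(1, max_steps + 1):
        left, right = next_location(t), rng.randrange(n)
        card_left, card_right = deck[left], deck[right]
        # Mark the card at L_t if the card at R_t is marked, or if R_t = L_t.
        if not marked[card_left] and (marked[card_right] or left == right):
            marked[card_left] = True
            marked_count += 1
        deck[left], deck[right] = deck[right], deck[left]
        if marked_count == n:
            return t
    raise RuntimeError("not all cards were marked within max_steps")


n = 50
samples = [strong_uniform_time(n, lambda t: t % n, random.Random(seed)) for seed in range(20)]
print(sum(samples) / len(samples))
```

Averaging the returned times over many runs, for growing $n$, gives an empirical impression of the $n \log n$ scaling asserted in Claim 2.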

Proof of Claim 2: Divide time into successive epochs of length $2n$, starting after the card at $L_1$ is marked. Denote by $u_k$ the fraction of unmarked cards before epoch $k$, so $u_1 = 1 - 1/n$. Let $m_k = 1 - u_k$. Let $\mathcal{H}_k$ denote the history of the process prior to epoch $k$, and note that $u_k$ is a function of $\mathcal{H}_k$.

Claim 3: $\mathbf{E}(u_{k+1} \mid \mathcal{H}_k) \le u_k [1 - 2\theta m_k]$ for all $k$, where $\theta = e^{-2}(1 - e^{-1})/2$.

Proof: Consider a card $x$, unmarked before epoch $k$. Of the $2n$ prescribed locations $\{L_t\}$ in the epoch, at most $n$ are their last occurrence in the epoch. Thus for $1 \le j \le n$ we can find $t(j) < s(j)$ in the epoch such that $L_{t(j)} = L_{s(j)}$. For each $j \le n$, we have $R_{t(j)} = x$ with probability $1/n$. Therefore, the event $A_x$ that there exists a $j \le n$ satisfying $R_{t(j)} = x$, has probability

$$\mathbf{P}(A_x \mid \mathcal{H}_k) \ge 1 - (1 - 1/n)^n \ge 1 - e^{-1}.$$

On $A_x$, we fix $j$ to be minimal such that $R_{t(j)} = x$. Given $A_x$ and $\mathcal{H}_k$, with probability at least $(1 - 1/n)^{2n-2} \ge e^{-2}$, we have $R_t \ne L_{t(j)}$ for all $t$ such that $t(j) < t < s(j)$. In that case, $x$ is untouched by the random choices between times $t(j)$ and $s(j)$, and then with probability at least $m_k$ the card at $R_{s(j)}$ is one of the $n m_k$ cards marked prior to epoch $k$. Thus $x$ gets marked with probability at least $2\theta m_k$. The assertion of Claim 3 follows. ■

Proof of Claim 2 continued: Using Claim 3, we first quantify the time to mark at least half the cards (i.e., to achieve $m_k \ge 1/2$), and then the time to mark the remaining cards (i.e., to achieve $u_k < 1/n$). Denote by $D_k$ the number of cards that get marked during epoch $k$ as a result of being transposed with a card that was marked prior to epoch $k$. Clearly $m_{k+1} \ge m_k + D_k/n$. The proof of Claim 3 implies that

(i) if $m_k \le 1/2$, then $\mathbf{E}(D_k \mid \mathcal{H}_k) \ge \theta n m_k$;

(ii) if $m_k \ge 1/2$, then $\mathbf{E}(u_{k+1} \mid \mathcal{H}_k) \le (1 - \theta) u_k$.

To bound the number of epochs where $m_k \le 1/2$, we need a stochastic lower bound for $D_k$:
Claim 4: If $m_k \le 1/2$, then

$$\mathbf{P}\Big(D_k \ge \frac{\theta n m_k}{2} \,\Big|\, \mathcal{H}_k\Big) \ge \frac{\theta^2}{8}.$$



Proof: Using the notation in the proof of Claim 3, denote by $\widetilde{D}_k$ the number of $j \le n$ such that $R_{s(j)}$ is one of the $n m_k$ cards marked prior to epoch $k$. Clearly $D_k \le \widetilde{D}_k$. The distribution of $\widetilde{D}_k$ is Binomial$(n, m_k)$, and this also holds given $\mathcal{H}_k$. Therefore,

$$\mathbf{E}(D_k^2 \mid \mathcal{H}_k) \le \mathbf{E}(\widetilde{D}_k^2 \mid \mathcal{H}_k) \le (n m_k)^2 + n m_k \le 2 (n m_k)^2.$$
In conjunction with (i) above, this yields

$$\mathbf{E}(D_k^2 \mid \mathcal{H}_k) \le C_2\, \mathbf{E}(D_k \mid \mathcal{H}_k)^2,$$
where $C_2 = 2\theta^{-2}$. A standard second moment bound (see, e.g., [16, p. 8]) now yields Claim 4. ■
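For completeness, the second moment step can be spelled out. The display below is a sketch that assumes the "standard second moment bound" is the Paley-Zygmund inequality, $\mathbf{P}(Z \ge a\,\mathbf{E}Z) \ge (1-a)^2 (\mathbf{E}Z)^2 / \mathbf{E}Z^2$ for a nonnegative random variable $Z$ and $0 \le a \le 1$, applied conditionally on $\mathcal{H}_k$ with $Z = D_k$ and $a = 1/2$.

```latex
\[
\mathbf{P}\Big(D_k \ge \tfrac{1}{2}\,\mathbf{E}(D_k \mid \mathcal{H}_k) \,\Big|\, \mathcal{H}_k\Big)
\;\ge\; \frac{\mathbf{E}(D_k \mid \mathcal{H}_k)^2}{4\,\mathbf{E}(D_k^2 \mid \mathcal{H}_k)}
\;\ge\; \frac{1}{4 C_2}
\;=\; \frac{\theta^2}{8},
\qquad C_2 = 2\theta^{-2}.
\]
```

When $m_k \le 1/2$, statement (i) gives $\frac{1}{2}\mathbf{E}(D_k \mid \mathcal{H}_k) \ge \theta n m_k / 2$, so the event on the left is contained in $\{D_k \ge \theta n m_k/2\}$, which is the threshold used to define a growth epoch.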

Proof of Claim 2 concluded: Call epoch $k$ a "growth epoch" if $m_{k+1} \ge (1 + \theta/2) m_k$. Call epoch $k$ a "good epoch" if it is a growth epoch or it satisfies $m_k \ge 1/2$. Claim 4 implies that the conditional probability that epoch $k$ is a good epoch, given $\mathcal{H}_k$, is at least $\theta^2/8$. Thus the number of good epochs among the first $k_3 = C_3 \log n$ epochs stochastically dominates a Binomial$(k_3, \theta^2/8)$ random variable. Fix $C_3 > 32\theta^{-3}$, and denote by $\Omega_3$ the event that there are at least $(4 \log n)/\theta$ good epochs among the first $k_3$ epochs. Recall that the probability that a binomial random variable differs from its mean by a constant multiple of the mean decays exponentially in the number of trials $k_3 = C_3 \log n$. We infer that $\mathbf{P}(\Omega_3^c) \le n^{-\beta}/2$ for some $\beta > 0$. Since $(1 + \theta/2)^{4/\theta} > e$ and $m_1 = 1/n$, the number of growth epochs must be smaller than $(4 \log n)/\theta$. Thus on $\Omega_3$ we have $m_{k_3} \ge 1/2$.

Turning now to the second portion, once $m_k \ge 1/2$ we have from (ii) above that $\mathbf{E}(u_{k+1} \mid u_k) \le (1 - \theta) u_k$. Therefore for all $k \ge 0$, we have

$$\mathbf{E}(u_{k_3 + k} \mid \Omega_3, u_{k_3}) \le (1 - \theta)^k u_{k_3} \le \frac{e^{-\theta k}}{2}.$$

Thus if $k = (1 + \beta)\theta^{-1} \log n$, then

$$\mathbf{P}(u_{k_3 + k} \ge 1/n \mid \Omega_3) \le \mathbf{E}(n u_{k_3 + k} \mid \Omega_3) \le \frac{n^{-\beta}}{2}.$$

In conjunction with the bound for $\mathbf{P}(\Omega_3^c)$, this implies that $\mathbf{P}(u_{k_3 + k} \ge 1/n) \le n^{-\beta}$ for this value of $k$. In other words, if $C_1 > C_0 = 32\theta^{-3} + \theta^{-1}$ and $k_1 = C_1 \log n$, then there exists $\beta = \beta(C_1) > 0$ such that $\mathbf{P}(u_{k_1} \ge 1/n) \le n^{-\beta}$.
This completes the proofs of Claim 2 and of the theorem. ■



4 Concluding remarks and further problems 

1. We have shown that the cyclic-to-random transposition shuffle on $n$ cards has mixing time of order $\Theta(n \log n)$. However, the constant in our general upper bound, and that in the specific cyclic-to-random upper bound of Mironov [17], are significantly larger than the constant in our lower bound. We believe that the lower bound is closer to the truth, and moreover, that this shuffle exhibits the "cutoff phenomenon", i.e., there is a constant $C_*$ such that for $t \le (1 - \epsilon) C_* n \log n$ the distribution after $t$ steps, $\mu_t^*$, satisfies $\|\mu_t^* - U\|_{TV} = 1 - o(1)$ as $n \to \infty$, while for $t \ge (1 + \epsilon) C_* n \log n$ we have $\|\mu_t^* - U\|_{TV} = o(1)$ as $n \to \infty$. Proving this, and determining $C_*$, remain a challenge.

2. Does the cyclic-to-random shuffle capture the key features of the RC4 cryptographic system, as suggested by Mironov [17]?

If the answer is positive, then we expect that the test function $F$ defined in (11) may play a role in future analysis of RC4.

3. For which sequence $\{L_t\}$ does the resulting semi-random transposition shuffle on $n$ cards have the largest mixing time?

We suspect that the slowest shuffle in this class is the "star transpositions" shuffle, for which $L_t = 0$ for all $t$, and the mixing time is $(1 + o(1))\, n \log n$ by [10].

4. Is there a universal constant $c > 0$ such that, for any semi-random transposition shuffle on $n$ cards, the mixing time is at least $c n \log n$?

For this lower bound question there is no obvious reduction to the case where the sequence $\{L_t\}$ is deterministic, so conceivably the question could have different answers for deterministic $\{L_t\}$ and random $\{L_t\}$. Two specific cases of interest are:

• For each $k \ge 0$, let $\{L_{kn+r}\}_{r=1}^{n}$ be a uniform random permutation of $\{0, \ldots, n-1\}$, where these permutations are independent.

• Let $\{L_t\}$ be a Markov chain with memory 2, where $L_1 = 0$, $L_2 = 1$ and for each $t \ge 3$ we have $L_{t+1} = 2L_t - L_{t-1} \bmod n$ with probability $1 - 1/n$ and $L_{t+1} = L_{t-1}$ with probability $1/n$. This choice of $\{L_t\}$ was suggested to us by Igor Pak (personal communication), motivated by [9].

Each of these examples has a "quenched" version, where the sequence $\{L_t\}$ is picked in advance and then used as a deterministic sequence, and an "annealed" version, where the $\{L_t\}$ are random variables with the specified distribution.

Acknowledgments: We are grateful to Serban Nacu for his help with some of the complex analysis used in this paper. 